ACROSS THE COUNTRY, states and school districts are working to establish
new systems for assessing teachers. Promising more accurate, reliable, and
useful data than evaluation systems of the past, nearly all are designed to
incorporate multiple measures of teacher performance. Thirty-two states and
the District of Columbia, at the urging of federal Race to the Top grant
requirements, have established policies that require evaluation systems to
use multiple measures. And the 43 states that were granted waivers from
requirements of No Child Left Behind have promised to develop evaluation
systems with multiple measures. While measures based on student test scores
have garnered the most public attention, measures based on observation
(watching and rating teachers’ classroom performance in real time) remain
the most important component of teacher evaluation systems. The oldest and
most common approach to assessing teaching, observation allows an evaluator
to make a direct, specific assessment of instruction in context and as it
occurs. Done well, observations provide precise, timely, and actionable
feedback that helps teachers understand and improve their practice. And,
unlike measures of student achievement, well-designed and well-executed
classroom observations can be used to evaluate all teachers irrespective of
grade level or subject.


An evaluation system that is designed to support and improve teacher
practice, rather than simply to assess and manage teacher performance,
would have as its foundation effective and reliable processes to observe
classroom practice and give teachers useful feedback. Such an
improvement-focused evaluation system should be aligned with and contribute
to a district-wide system that includes a shared definition of teacher
quality, a clear set of district priorities, and a coherent strategy for
improving teacher practice.


High-quality observations demand skilled observers who deeply understand a
common frame of reference for quality teaching (the rubric their district
uses), who reliably interpret classroom evidence according to this rubric,
and who give teachers timely feedback that is targeted to improve their
practice.¹ The challenge for districts and states now is how to train and
support these observers.²


What types of training and support do observers need, and what is being
provided? Does effective training improve observer skills and, ultimately,
classroom practice? To address these questions, we examined recent research
on observer training and explored the training efforts in a sampling of
five districts: Boston, MA; the District of Columbia; Santa Fe, NM;
Maricopa County, AZ;³ and New Haven, CT.⁴ What follows are key takeaways
from our research and from conversations with trainers, researchers, and
district officials.

Learning the Rubric


All five districts offer their observers some form of introductory
training, delivered in written materials or through online or in-person
seminars. These initial courses typically cover procedural basics, such as
when and how often observations must occur, what processes must be
followed, and how to score and provide feedback in an online platform.


But this logistical know-how is secondary to the observers’ knowledge and
understanding of the observation rubric. Jilliam Joe and fellow researchers
at the Educational Testing Service (ETS) write that “the most important
thing is making sure the principals have a true understanding of the
[observation] instrument.”⁵ In their view, training should not only build
knowledge of each rubric component but also provide “information about the
development, validation, and research base for the instrument.”

Training observers to understand the rubric and its many components is a
time-consuming process, but one that district officials and observers
themselves recognize as essential. Assistant Superintendent Lori Renfro
says that observers in Maricopa County “like and want repeated practice
with the rubric and elements.” As part of its initial 30-hour training,
Maricopa trains its observers to learn the rubric and become familiar with
its components and requires a final assessment of that knowledge that
observers must pass before they are considered qualified.


Training on the rubric also helps observers understand how to accurately
identify and record evidence during an observation. Joe and her colleagues
say that good training helps observers understand what evidence to collect,
as well as how to sort evidence into the right dimensions of the rubric.
Training, the ETS researchers explain, needs to help observers “learn to
use the lens of the instrument to search for and record evidence
consistently across classrooms.” Observers have to “internalize specific
observer skills, such as becoming attuned to words and behaviors,
recognizing evidence as a set of facts without opinion, distinguishing key
evidence from other evidence, accurately sorting evidence into dimensions,
and accurately documenting evidence.”


Boston’s Observation and Feedback course, developed with New Teacher
Center, teaches participants to use a tool called Content, Strategy, Impact
(CSI). Using CSI, the observer learns to recognize what academic content is
being taught, the instructional strategy the teacher uses, and what
students do or produce as a result. “For a lot of evaluators in [Boston
Public Schools], this is revolutionary,” says Jess Madden-Fuoco,
coordinator of the course. Instead of trying to take in everything that is
happening in a classroom and synthesizing a final judgment based on those
impressions, observers in Boston now collect evidence in a much more
systematized and strategic way. The same is true in other districts,
including Santa Fe. Almi Abeyta, deputy superintendent for teaching and
learning for Santa Fe Public Schools, says that the district trains
observers to “ground their observations of teachers in evidence rather than
just making normative statements like, ‘The teacher has good classroom
management.’” Using examples, Santa Fe observers must define what “good
classroom management” actually means in terms specifically aligned to the
district’s rubric.

Interpreting Evidence


Another significant challenge to training observers, research shows, is
reducing variation in scoring.⁶ Observers need to be trained to look for
the same things, to use similar language for what they see and, ultimately,
to rate observations reliably. Studies suggest that calibration, the
process of improving scoring accuracy by checking and adjusting ratings
through comparison to model evaluations, is key to reducing variation and
improving reliability. Tysza Gandha, a researcher at the Southern Regional
Education Board (SREB), says that calibration could involve observers
watching videos that have been pre-scored by expert observers, or two or
more observers watching a teacher in a live context and afterwards
comparing their notes and ratings to one another’s and sometimes to a
standard score. “The point,” Gandha says, “is to give observers multiple
and ongoing opportunities to reflect on their rating accuracy and the basis
on which they evaluate teaching, and to increase awareness of potential
systematic biases influencing their judgments.” Indeed, research has shown
that the practice of regularly recalibrating after initial training
significantly improves scoring accuracy and consistency.⁷

IN FOCUS: NEW HAVEN

The training that New Haven originally provided for observers was simple
score validation. Observers would watch teacher videos, talk with other
observers about the scores they would give to the teachers, then
individually complete an evaluation report that an outside group would
compare against a standard score.

“Participants did not like this system,” says Michele Sherban, assistant
principal assigned to educator evaluation and development for the New Haven
Public Schools, “as it was not very true-to-life.” So New Haven added the
Collegial Calibration training, which takes place in actual classrooms and
goes beyond just ensuring that observers’ ratings of teachers are
“correct”.

Now, says Sherban, the observer training system is more than just
calibrating scores. “That’s a part of it,” she says, “but we want to move
forward [to get to] the evidence and feedback they’re providing to
teachers.”


Given the wide variability in observer background and experience, bias is a
common problem. “Observers are not blank slates,” note researcher Courtney
Bell and colleagues at ETS. “Most observers are former teachers and, based
on that teaching experience, have ideas about what counts as high quality
teaching and learning.” These preconceived notions mean observers are
almost certain to be disposed to judge and score performance differently.⁸
District officials highlighted the need for modules specifically designed
to train observers to score consistently, accurately, and without bias
according to the district’s rubric. Though the five districts we examined
all address bias in some way during training, only Boston offers a separate
module wholly devoted to “implicit bias,” which is designed to mitigate
individual variation by helping observers understand how their beliefs and
attitudes might affect their scoring.


But all of the districts are intently focused on calibration. In a typical
calibration training, observers watch videos or live teaching and score
teacher performance using a common rubric. They then compare their scores
against standard scores provided by experts, or “master coders”. In some
cases, observers have a chance to discuss their evidence and scores with
each other before comparing them to a standard score. (How did you reach
that score? Why did you give a 3 instead of a 2?) Some districts are using
formal cohorts of observers to conduct exercises throughout the school year
that aim to reduce variation in observer scores. Maricopa organizes
observers into “calibration cadres,” groups of peer evaluators who together
watch and score video demonstrations of teacher performance. New Haven’s
Collegial Calibration training requires observers to conduct multiple
rounds of classroom visits. In small groups, participants observe three
classrooms, debriefing together after each visit. After the third visit,
each observer completes a full report as if he were evaluating that
teacher. These reports are entered into an online platform, where trainers
who observed the same lesson can monitor observers’ scores for accuracy and
give feedback to observers on the items they identified for teacher
improvement.

IN FOCUS: SANTA FE

Santa Fe observers get a double dose of calibration training. The state of
New Mexico recommends that observers participate in a full-day “Calibration
Rounds” training, provided by the Southern Regional Education Board (SREB),
which takes place twice a year in actual classrooms.

Observers practice scoring individually, in small groups, and with the
whole group of participants, and talk through any discrepancies to arrive
at consensus scores. In addition, says SREB trainer Yvonne Garcia, “part of
the training is to teach the calibration process so that it can be
replicated with [participants’] schools or districts.”

Before New Mexico began providing the SREB training, Santa Fe already
offered its own calibration training, which observers are still required to
attend.


Some observers are put off by calibration training. Some seasoned
administrators, for example, may resent that after years of classroom
practice and administrative experience they have to essentially prove they
can accurately judge teacher performance in the manner specified by the
observation system. But most who participate in the training seem to find
it worthwhile. District officials certainly do. They have to pay attention
to “rater drift”, says Maricopa’s Lori Renfro, if they want a system that’s
really reliable between observers and over time.


A Focus on Feedback 


In order for an observation system to improve teaching practice, a crucial
part of evaluation happens after the observing and scoring, when the
observer sits down with the teacher to share scores and feedback. Ideally,
the observer would act as an instructional coach during this meeting,
helping the teacher reflect on what went well and why, as well as what she
might improve upon and how.

Research suggests that these post-observation feedback sessions are
essential to the improvement of teacher practice, but they are not easy to
do well. Despite the training time they spend on accurate scoring and
calibration, observers are often ill-prepared to offer actionable,
high-leverage feedback or to conduct effective and collegial conferences
with teachers. Michele Sherban, assistant principal assigned to educator
evaluation and development for New Haven Public Schools, describes what’s
happening in many districts (where the focus can be more on assessing
teacher practice than on improving it) when she says that New Haven’s
observers are “good at the rubric and scoring, they can come close to the
‘right score.’ We’re great evaluators, but not great developers of
teachers’ practice.”

IN FOCUS: BOSTON

In Boston, one of the four required courses for observers is Observation
and Feedback, a 15-hour, in-person training delivered by district
administrators. The course, developed in partnership with New Teacher
Center, teaches evaluators how to give feedback that supports teacher
development.

Previously, some observers “had been giving teachers pages of notes with
some pieces of feedback, [which was] confusing for teachers,” says Jess
Madden-Fuoco, coordinator of the course. Now, feedback is more targeted and
is based on evidence about which strategies worked to produce positive
student outcomes, and which didn’t.

Though the results of the training on teacher practice remain to be seen,
the response from observers has been positive. “They love this course
because it is...directly connected and highly relevant to their work” as
observers and developers of teacher practice, says Madden-Fuoco.


Building the capacity of observers and teachers to give and receive
feedback is critical to the success of classroom observations, concludes a
recent study of state policies and practices by SREB. For example, “One
challenge is the emotional dimension of giving and receiving feedback.
Dialogue around observations can be uncomfortable and highly emotional for
both parties; even if most of the comments are positive and suggestions are
only constructive, many principals and teachers have rarely or never
experienced these types of ‘courageous conversations,’” the authors write.⁹
A study of feedback practices by the Carnegie Foundation for the
Advancement of Teaching found that teachers across a number of districts
saw many of these sessions as not only challenging but even threatening.¹⁰
Teachers described the wait between observation and debrief as
“excruciating” and the feedback they received as neither sufficiently
concrete nor helpful.


IN FOCUS: MARICOPA COUNTY 


Maricopa County originally covered feedback 
and conferencing as part of its 30-hour Qualified 
Evaluator Training, along with introductory material 
on the rubric, evidence-collection, and scoring. But 
the time devoted to the topic during initial training 
wasn't enough for observers, training facilitators 
reported. “There simply wasn’t enough time to deliver 
all of the content in a way that could be internalized 
and implemented,” Assistant Superintendent Lori 
Renfro explains. 

So now, separate workshops on feedback and 
conferencing are offered after completion of the 
original training. “This gives [observers] a chance 
to get the basic information [in initial training], 
go out and try it, and run into the problems of 
conferencing,” says Renfro. “Then [they] come back 
for more training” during the workshops. 

Maricopa also plans to offer courses to help
observers identify high-leverage feedback for each
element of the evaluation rubric. “We've noticed 
patterns in our evaluators,” says Renfro. “When 
they're new, they'll give feedback [on items] that 
they're comfortable with, rather than the ones that 
give the most bang for your buck.” For example, an 
evaluator with knowledge in an instructional area like 
routines and procedures for classroom management 
may decide to weigh in on that topic, rather than one 
in which she feels less competent. “We've had to 
think about... what are the best [feedback items] to 
choose?” Renfro says. 

Eventually, the results from observer feedback
and evaluations will be tied to targeted goals for
teacher improvement, and every teacher will have an
Educator Goal Plan with specific recommendations
for action and aligned opportunities for professional
development.

All five districts highlighted the importance of training observers to
identify and deliver timely and meaningful feedback to teachers. Boston,
Maricopa, and DC conduct training sessions explicitly devoted to feedback
and conferencing, and DC supplies observers with strategies, matched to
every level of every standard of the evaluation rubric, designed to move
teachers to the next level in their practice. In New Haven and Santa Fe,
feedback is a significant component of plans to expand observer training.
Santa Fe’s next step, says Abeyta, is “training principals to have
productive conversations with teachers, reflective conversations about
improvement.” Officials are asking themselves, “What does delivering a
specific message for improvement look like?” Abeyta says. “We couldn’t have
those conversations with the checklist system we had in place for
observations before, so we’re getting there now.”

Challenges


Given all that observers need to learn and the necessity of ongoing
recalibration and support, it is no wonder that districts are struggling to
design, implement, and evaluate quality observer training that aligns with
district priorities and definitions of teacher quality and the purpose of
observation. What follows are some of the major considerations and
challenges the five districts are facing.


Designing Training: Online and Collaborative Approaches

To accommodate their observers’ already packed schedules, many districts
have moved some components of their training online. Online courses range
from materials that explain the basics of the evaluation system and
observation protocol to more advanced trainings in which observers watch
videos to practice collecting evidence and scoring. Districts turn to
online training to reduce costs, standardize content, and make training
more accessible and convenient. And online, they can more easily collect
data on what observers are learning and how they are performing. Boston,
for example, used to host its “Evaluation 101” trainings in person, but now
does so through online modules that observers can click through on their
own. Angela Rubenstein, former director of human capital support services
for Boston Public Schools, says the district simply didn’t have the
capacity to continue training in person. And “putting it online helped us
codify the materials,” she says, “compiling all the best work from the
[previous] in-person trainings.”

IN FOCUS: WASHINGTON, D.C.

Washington, DC’s online platform is particularly comprehensive. Each
standard of the evaluation rubric has its own online module, and each
module teaches observers about the components of the standard, how to
collect evidence related to the standard during observation, and how to
interpret the evidence collected to arrive at a score for that standard. At
the end of each module, observers score a video of teaching practice; the
platform can assign an extra task if it appears the observer could benefit
from additional training.

After observers learn and successfully score a series of three individual
standards, they then score a video that groups those three standards
together. “This approach allows new evaluators to focus on a particular
standard and the evidence they should collect, before increasing the
difficulty to rating multiple standards at once,” Stephanie Shultz,
director of the Align Teaching and Learning Framework training platform,
explains.

School leaders and district staff can access the online platform to see how
closely observers’ scores match the master-coded scores on these
assessments; this information can be used to better differentiate observer
training for particular areas of need.


However, districts still see great value in training observers in person.
New Haven’s Michele Sherban says that her district could meet state
requirements by having just a single online course. But, she says, “that
doesn’t really delve deep into what [observers] are seeing” in actual
classrooms. What matters most, she says, is the combination of observing
teacher practice in real time plus “the exchanging of ideas and the rich
team conversations about what really works.” New Haven observers naturally
form cohorts during their sessions together, Sherban says, and these groups
support each other throughout the year.


This kind of collaborative approach was described by several district
officials as essential to observer training. Adding a collaborative
component to classroom training allows participants to observe together in
small groups and debrief afterwards, talking about what they saw and
discussing how they might provide feedback to the teacher they watched. DC
uses “Learning Walks”, and though the twice-monthly sessions are optional,
they have proven popular among participants, who consistently identify them
as among the most helpful features of the training program. When
administrators are refining best practices for feedback, says Stephanie
Shultz, director of District of Columbia Public Schools’ Align Teaching and
Learning Framework training platform, “dialogue with colleagues is a
critical element.” It helps with calibration, too. “We want to ensure that
observers throughout the district share a common definition of
instructional best practice... [which] is inherently collaborative work,”
Shultz says. Santa Fe takes a more direct approach, formally organizing
observers into professional learning communities that, in addition to
calibrating together during monthly “instructional rounds”, meet regularly
to share what they are learning from their own formal observations. These
regular meetings, Abeyta says, “allow us as a district to build a common
language about what an effective teacher looks like and does.”

Capacity for Training: Money, Time, Resources, and People

Districts vary in their capacity to provide all the necessary components of
observer training. Many of them turn to their states for support, but it is
clear that even districts and state agencies cannot do this work alone. As
a result, the number of vendors offering their own ready-made observation
rubrics, video libraries, calibration training, and other supports is on
the rise. Private companies, both for profit and nonprofit, are joined by
regional education laboratories, principals’ associations, and others to
make up a burgeoning industry providing support and training for teacher
observers.


All five districts have used or currently rely on outside support of some
kind, primarily because they lack the capacity to do it all themselves.¹¹
Lacking an external provider, districts must design and develop training
materials, online platforms, and practice videos, as well as find qualified
facilitators to lead and oversee training. Some districts use grants to
help fund observer training initiatives. DC, for example, used a grant from
the Bill & Melinda Gates Foundation to build its online training platform,
and Maricopa County is in the fifth year of a five-year federal Teacher
Incentive Fund grant (totaling $57.8 million) that it has used partly to
support evaluation and observation.


But most districts still struggle to find the time and staff they need. It
can be a challenge for observers themselves to dedicate sufficient time to
training. DC’s solution, Shultz says, is to plan training opportunities
strategically and make sure that they are as convenient as possible to
offset the amount of time observers need to develop fluency with the rubric
and scoring. After all, “to do [this work] at this level is a significant
number of hours to ask someone to spend,” says Shultz, “especially if
they’re a new school leader.”


Though New Mexico requires observers to pass a certification test after
training, Santa Fe has chosen not to require observers to take any
additional district-level assessments. “Principals have enough on their
plates,” says Abeyta. And in Maricopa County’s rural districts, there
simply aren’t enough people to offer extra support. School administrators
wear so many hats, Renfro says, that capacity can be even more of an issue
for them than for urban districts. While New Haven has opened up its
Collegial Calibration training to any observers who want to join the
sessions (whether they are new to the system or not), only a handful of
veteran evaluators have chosen to participate because the sessions are so
time-consuming.


All this is in addition to the unique challenges of working with providers
from outside the district. New Haven uses the services of the consulting
firm ReVision Learning, which also provides services to the entire state of
Connecticut, for training sessions that continue throughout the school
year. “The idea of bringing all these people [ReVision facilitators,
district trainees, and district and sometimes state administrators]
together is a good one,” Sherban says. But, she says, “it’s a logistical
challenge to bring people together after the initial training.”

The challenges extend far beyond logistics. Districts must choose between
turning to outside expertise, which may not align with district definitions
of teacher quality, and developing a home-grown system. Some are trying to
build in-house capacity to run trainings and support observers, and some
are creating new roles altogether. Boston’s Jess Madden-Fuoco, for
instance, serves part-time as director of instruction at a school while
also coordinating the district’s Observation and Feedback course and
facilitating training sessions. Madden-Fuoco says these dual roles are key
to the training program’s success. “There are real advantages to
school-based leaders leading this work,” she says, noting that observers
trust facilitators because they are all school-based leaders. “It would be
so hard for someone to teach a course in the way that we’re able to,” she
says. Because she herself is doing the work that observers are doing every
day, she knows the system from the observer’s perspective as well as the
trainer’s.


Other districts are including training responsibilities in the job
descriptions for existing roles. DC’s master educators (who are full-time
observers and evaluators) and experienced school leaders now co-facilitate
orientation sessions for new principals, which include initial training on
the evaluation rubric and observation. Master educators and school leaders
who have extensive experience with the evaluation framework also serve as
“anchor raters” for the videos of teacher practice used in the online
training platform. And in Maricopa County, some district staff members have
shared facilitator duties with the field specialists originally hired for
the job. New trainers are also paired with mentor trainers.


Measuring the Impact of Training 

We know that training observers is essential to ensuring high-quality
observation, which in turn is key to improving teacher quality. But little
is known about the outcomes of observer training, and efforts to measure
these outcomes are nascent at best. Districts now primarily collect
self-reports from observers on their training. District of Columbia Public
Schools, for example, can say that 100 percent of those who complete its
online training agree or strongly agree that it helped them develop fluency
with observation and scoring. Boston, says Rubenstein, is collecting
feedback from observers and teachers so the district knows what questions
evaluators have about the process. But she says it’s hard to know if the
training is improving observation.


Surveys may show whether observers believe they 
are well-trained, and quantitative data (based on 
calibration checks) may show whether observers 
are accurately and consistently scoring teachers ac- 
cording to the rubric. But none of the five districts 
have yet managed to link observer training to im- 
provement in teacher practice. Some, including New 
Haven, are considering measuring the effectiveness 
of their training programs by looking at observers’ 
scoring reports over time; they would, in effect, 


« »”» 
score the scorers. 
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Researchers are trying to help districts estimate the 
effect of observation training on teacher practice, 
and, ultimately, on student learning. “We're learning 
as we go,” says Matthew Kraft, an assistant professor 
of education at Brown University who is leading a 
study of teacher observation in Boston. “We want 
user-experience data,” Kraft says, “so we look at pre- 
and post-surveys of principals participating in the
trainings, surveys of teachers on the quality of feedback
they are receiving, and online reporting of the
number of times people have been evaluated.” But 
he says that while it is “tempting to look at a lot of 
outcomes, we're still not clear which ones are most 
telling.” Efforts like this, by and large, remain an
area of growth for districts.


Conclusion

The rise of more complex teacher evaluation
systems has meant more requirements for teacher
observation, and therefore more complex observer
training systems.¹² All five of the districts we
examined described multiple trainings for observers 
and some form of ongoing support during the school 
year. In these districts, training has grown from an
introductory session into a series of courses that 
cover more content and provide more support on 
topics from evidence collection to post-observation 
conferencing. It is clear, based on research and 
conversations with officials in these districts, that 
observer training is no longer a “one and done”
system, if it ever was.¹³


But simply adding more training opportunities will 
not in and of itself lead to higher quality observa- 
tion systems. In developing and refining observer 
training, states and districts must not lose sight of 
the purpose of observation and the observer’s role in 
serving that purpose. If the goal is to improve teach- 
er practice, an evaluation system must ensure that its 
observers have a deep and common understanding 
of what quality teaching is, know how to accurately 
identify and reliably interpret evidence of good practice
across contexts, and are able to provide useful
and meaningful feedback to teachers.


Sarah McKay is an associate for public policy engagement at 
the Carnegie Foundation for the Advancement of Teaching. 
Elena Silva is a former senior associate for public policy 
engagement at the Carnegie Foundation for the Advancement 
of Teaching. 
Districts must make strategic decisions about how to support observers; they
otherwise run the risk of overloading observers with unnecessary or ineffective 
training. The principles of improvement listed below can help districts target 
training to specific observer needs, align training with other district priorities and 
systems, and measure the effectiveness of training opportunities in order to im- 
prove them. 


Make the work problem-specific and user-centered. It starts with a single 
question: “What specifically is the problem we are trying to solve?” It enlivens 
a co-development orientation: engage key participants early and often. 


Variation in performance is the core problem to address. The critical issue 
is not what works, but rather what works, for whom and under what set of 
conditions. Aim to advance efficacy reliably at scale. 


See the system that produces the current outcomes. It is hard to improve 
what you do not fully understand. Go and see how local conditions shape 
work processes. Make your hypotheses for change public and clear. 


We cannot improve at scale what we cannot measure. Embed measures 
of key outcomes and processes to track if change is an improvement. We in- 
tervene in complex organizations. Anticipate unintended consequences and 
measure these too. 


Anchor practice improvement in disciplined inquiry. Engage rapid cycles of 
Plan, Do, Study, Act (PDSA) to learn fast, fail fast, and improve quickly. That 
failures may occur is not the problem; that we fail to learn from them is. 


Accelerate improvements through networked communities. Embrace the 
wisdom of crowds. We can accomplish more together than even the best of us 
can accomplish alone. 
ENDNOTES


1 Sandra Park, Sola Takahashi, and Taylor
White, “Developing an Effective Teacher
Feedback System,” Carnegie Foundation for the
Advancement of Teaching, 2014.
http://www.carnegiefoundation.org/resources/publications/developing-effective-teacher-feedback-system/


2 For the purpose of this brief, the term “observer”
is defined as anyone conducting classroom observa- 
tions as part of a school district or state evaluation 
system. 

3 “Maricopa County” refers to the 12 districts
in Maricopa County and greater Phoenix that
are participating in the Rewarding Excellence in
Instruction and Leadership (REIL) program, funded 
by federal TIF grants and managed by the Maricopa 
County Education Service Agency (MCESA). 


4 This is the last in a series of three briefs examining
emerging trends in teacher evaluation systems.
The first, “Evaluating Teachers More Strategically: 
Using Performance Results to Streamline Evaluation 
Systems,” examined strategies to differentiate evalu- 
ation based on performance. The second, “Adding 
Eyes: The Rise, Rewards and Risks of Multi-Rater 
Evaluation Systems,” examined how districts and 
states are employing multiple raters to observe and 
evaluate teacher performance. The districts high- 
lighted in this final brief are a geographically diverse 
subset of those profiled in the previous brief, all of 
which had described observation training as a necessary
or immediate goal.


5 Jilliam N. Joe et al., “Foundations of
Observation: Considerations for Developing an
Observation System That Helps Districts Achieve
Consistent and Accurate Scores,” ETS and MET
Project, Bill & Melinda Gates Foundation, 2013.
http://www.metproject.org/downloads/MET-ETS_Foundations_of_Observation.pdf


6 Courtney A. Bell et al., “An Argument Approach
to Observation Protocol Validity,” Educational
Assessment 17, no. 2-3 (2012): 1-26. Also, Jodi
M. Casabianca et al., “Effect of observation mode
on measures of secondary mathematics teaching,”
Educational and Psychological Measurement 73, no.
5 (2013): 757-783.


7 Anne H. Cash et al., “Rater calibration when ob- 
servational assessment occurs at large scale: Degree 
of calibration and characteristics of raters associ- 
ated with calibration,” Early Childhood Research 
Quarterly 27, no. 3 (2012): 529-42.


8 See Cash, “Rater calibration,” 529-42. Also see
Courtney A. Bell et al., “Improving Observational 
Score Quality: Challenges in Observer Thinking,” 
in Designing Teacher Evaluation Systems: New 
Guidance from the Measures of Effective Teaching 
Project, eds. Thomas J. Kane et al. (San Francisco: 
Jossey-Bass, 2014): 50-97. 


9 Tysza Gandha and Andy Baxter, “Toward
Trustworthy and Transformative Classroom
Observations: Progress, challenges and lessons in
SREB states,” Southern Regional Education Board,
2015. http://publications.sreb.org/2015/SREB_COReportOnline.pdf


10 Jeannie Myung and Krissia Martinez, “Strategies 
for Enhancing the Impact of Post-Observation 
Feedback for Teachers,” Carnegie Foundation for 
the Advancement of Teaching, 2013.
http://www.carnegiefoundation.org/resources/publications/strategies-enhancing-impact-post-observation-feedback-teachers/


11 Santa Fe’s calibration training is facilitated by
Research for Better Teaching, a Boston-based pro- 
fessional development training and consulting firm. 
Santa Fe observers also receive training provided by 
the state of New Mexico and the Southern Regional 
Education Board (see BOX). Boston partnered with 
the New Teacher Center, a nonprofit professional 
development organization funded primarily by
the Gates Foundation, to develop its Observation
and Feedback course. Boston is also currently
using Teachscape’s online platform to develop a
video library that can be used for calibration. New 
Haven’s Collegial Calibration training is facilitated 
by ReVision Learning. The state of Connecticut 
and the Connecticut Association of School 
Administrators have also contracted with ReVision 
Learning to provide Collegial Calibration training 
to district representatives at no cost to districts. 


12 Schools and districts must adhere to teacher evaluation
policies which may, depending on the state,
stipulate length of observations (e.g., a minimum of
30 minutes), frequency of observations, instruments
used for observation, qualifications of observers, and
whether teachers receive advance warning that an
observation will occur.


13 Thomas J. Kane, Kerri A. Kerr, and Robert C.
Pianta, eds., Designing Teacher Evaluation Systems:
New Guidance from the Measures of Effective
Teaching Project (San Francisco: Jossey-Bass, 2014).